
BB8: A Scalable, Accurate, Robust to Partial Occlusion Method for Predicting the 3D Poses of Challenging Objects without Using Depth


Abstract

We introduce a novel method for 3D object detection and pose estimation from color images only. We first use segmentation to detect the objects of interest in 2D, even in the presence of partial occlusions and cluttered background. By contrast with recent patch-based methods, we rely on a "holistic" approach: We apply to the detected objects a Convolutional Neural Network (CNN) trained to predict their 3D poses in the form of the 2D projections of the corners of their 3D bounding boxes. This, however, is not sufficient for handling objects from the recent T-LESS dataset: These objects exhibit an axis of rotational symmetry, and the similarity of two images of such an object under two different poses makes training the CNN challenging. We solve this problem by restricting the range of poses used for training, and by introducing a classifier to identify the range of a pose at run-time before estimating it. We also use an optional additional step that refines the predicted poses. We improve the state-of-the-art on the LINEMOD dataset from 73.7% to 89.3% of correctly registered RGB frames. We are also the first to report results on the Occlusion dataset using color images only. We obtain 54% of frames passing the Pose 6D criterion on average on several sequences of the T-LESS dataset, compared to the 67% of the state-of-the-art on the same sequences, which uses both color and depth. The full approach is also scalable, as a single network can be trained for multiple objects simultaneously.
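To make the core pipeline step concrete, below is a minimal Python/OpenCV sketch (not the authors' code) of how a 6D pose can be recovered from the eight predicted corner projections with a PnP solver. The box half-extents, camera intrinsics, and ground-truth pose are made-up values, and the CNN output is simulated by projecting the corners with that known pose.

```python
import numpy as np
import cv2

# A minimal sketch of the BB8-style pose recovery step: the CNN regresses
# the 2D projections of the 8 corners of the object's 3D bounding box,
# and the 6D pose is then recovered from the 3D-2D correspondences via PnP.

# 8 corners of an axis-aligned bounding box in the object frame
# (hypothetical half-extents, in meters).
hx, hy, hz = 0.05, 0.04, 0.03
corners_3d = np.array([[sx * hx, sy * hy, sz * hz]
                       for sx in (-1, 1)
                       for sy in (-1, 1)
                       for sz in (-1, 1)], dtype=np.float64)

# Assumed pinhole camera intrinsics (illustrative values only).
K = np.array([[572.4, 0.0, 325.3],
              [0.0, 573.6, 242.0],
              [0.0, 0.0, 1.0]])

# Stand-in for the CNN output: project the corners under a known pose to
# obtain synthetic "predicted" 2D corner locations in pixel coordinates.
rvec_gt = np.array([[0.2], [-0.4], [0.1]])    # axis-angle rotation
tvec_gt = np.array([[0.02], [-0.01], [0.6]])  # translation (meters)
corners_2d, _ = cv2.projectPoints(corners_3d, rvec_gt, tvec_gt, K, None)

# Recover the rotation and translation from the corner correspondences.
ok, rvec, tvec = cv2.solvePnP(corners_3d, corners_2d, K, None,
                              flags=cv2.SOLVEPNP_ITERATIVE)
print("recovered rotation:", rvec.ravel())
print("recovered translation:", tvec.ravel())
```

With exact correspondences as above, the recovered pose matches the ground truth; with noisy CNN predictions, the PnP solve gives the initial pose estimate that an optional refinement step can then improve.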